Add batch span processor benchmarks #3017
sdk/trace/src/jmh/java/io/opentelemetry/sdk/trace/export/BatchSpanProcessorCpuBenchmark.java
sdk/trace/src/jmh/java/io/opentelemetry/sdk/trace/export/BatchSpanProcessorMetrics.java
private long getMetric(boolean dropped) {
  String labelValue = String.valueOf(dropped);
  Optional<Long> value =
I have a feeling that two loops would be very similar code while being more readable.
The existing code uses two loops, and it took me a good amount of time to debug why the dropped/exported span metrics were invalid; I got a bit confused about what the two loops were doing. A functional style would make it easy to see what is being filtered and how the data is being mapped.
That said, I would leave this to the maintainers.
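To make the functional option in this thread concrete, here is a minimal sketch of a stream-based lookup. It is not the code from the PR: the `Point` class, the `points` parameter, and the `"dropped"` label key are hypothetical stand-ins for whatever the metrics export actually returns.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Optional;

final class DroppedMetricSketch {

  // Hypothetical stand-in for one exported metric point; not an SDK type.
  static final class Point {
    final Map<String, String> labels;
    final long value;

    Point(Map<String, String> labels, long value) {
      this.labels = labels;
      this.value = value;
    }
  }

  // Sums the values of all points whose (assumed) "dropped" label matches the flag.
  static long getMetric(List<Point> points, boolean dropped) {
    String labelValue = String.valueOf(dropped);
    Optional<Long> value =
        points.stream()
            .filter(p -> labelValue.equals(p.labels.get("dropped")))
            .map(p -> p.value)
            .reduce(Long::sum);
    return value.orElse(0L);
  }

  public static void main(String[] args) {
    List<Point> points =
        Arrays.asList(
            new Point(Collections.singletonMap("dropped", "true"), 7L),
            new Point(Collections.singletonMap("dropped", "false"), 93L));
    System.out.println(getMetric(points, true)); // prints 7
    System.out.println(getMetric(points, false)); // prints 93
  }
}
```

The filter and the map step are each one line, which is the readability argument made above; an equivalent pair of loops would need the same label comparison written out by hand.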
...e/src/jmh/java/io/opentelemetry/sdk/trace/export/BatchSpanProcessorMultiThreadBenchmark.java
Description: This PR adds two benchmarks.
1. The current benchmark executes forceFlush() on every loop iteration, which creates a bottleneck, so the batch span processor is never actually stressed. It also measures only throughput, which is not helpful on its own because the number of spans that get exported matters too. BatchSpanProcessorMultiThreadBenchmark is created to address this issue.
2. Measuring the CPU usage of the exporter thread is also important, but the current benchmarks consume as much CPU as possible, which makes that measurement meaningless. To maintain a steady state, this PR adds a benchmark that generates 10k spans per second per thread. One needs to attach a profiler such as YourKit or JProfiler to the benchmark run to understand the processor's CPU usage. BatchSpanProcessorCpuBenchmark is created for this purpose.
This PR also fixes a bug in calculating the number of dropped/exported spans: previously the two counts were swapped, each reporting the other's value.
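The steady-state idea (a fixed number of spans per second per thread, rather than a busy loop) can be sketched as a paced loop. This is an illustration under assumptions, not the benchmark from this PR: `FixedRateSketch`, `dueNanos`, and `runAtFixedRate` are hypothetical names, and the `work` callback stands in for starting and ending one span.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.LockSupport;

final class FixedRateSketch {

  // Time at which the n-th event (0-based) is due for a given rate.
  static long dueNanos(long startNanos, long n, int perSecond) {
    return startNanos + n * TimeUnit.SECONDS.toNanos(1) / perSecond;
  }

  // Runs `work` on a fixed schedule of `perSecond` events per second until the
  // schedule passes `durationMillis`; returns how many times `work` ran.
  // If the thread falls behind, it catches up without sleeping.
  static long runAtFixedRate(Runnable work, int perSecond, long durationMillis) {
    long start = System.nanoTime();
    long end = start + TimeUnit.MILLISECONDS.toNanos(durationMillis);
    long emitted = 0;
    while (true) {
      long due = dueNanos(start, emitted, perSecond);
      if (due - end >= 0) {
        break; // the next event would fall outside the measurement window
      }
      long wait;
      while ((wait = due - System.nanoTime()) > 0) {
        LockSupport.parkNanos(wait); // sleep instead of spinning to keep CPU usage low
      }
      work.run(); // in a real benchmark: start and end one span (assumption)
      emitted++;
    }
    return emitted;
  }

  public static void main(String[] args) {
    AtomicLong spans = new AtomicLong();
    // 10k events per second for 50 ms is a schedule of 500 events.
    long count = runAtFixedRate(spans::incrementAndGet, 10_000, 50);
    System.out.println(count); // prints 500
  }
}
```

Because the generating thread sleeps between events, the CPU time a profiler attributes to the exporter thread is not drowned out by the producers.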
Codecov Report
@@ Coverage Diff @@
## main #3017 +/- ##
===========================================
+ Coverage 0 90.72% +90.72%
- Complexity 0 2813 +2813
===========================================
Files 0 324 +324
Lines 0 8774 +8774
Branches 0 883 +883
===========================================
+ Hits 0 7960 +7960
- Misses 0 551 +551
- Partials 0 263 +263
Having fixed-throughput overhead benchmarks is very helpful, thanks!
I finally got to spend some dedicated time on this today and got JMH hooked up with the async-profiler profiling option. I do see some significant overhead introduced by the use of ArrayBlockingQueue (although it's been tough to get really consistent results). Thanks for putting in the work on this. Now, to agree on the best solution to the issue. :)